import os
# Data directory
homedir = os.path.join(os.path.expanduser('~'), "grass_ncsu_2023")
# GRASS GIS database variables
#grassbin = "grassdev"
grassbin = "grass"
grassdata = os.path.join(homedir, "grassdata")
location = "eu_laea"
mapset = "italy_LST_daily"
# Create directories if not already existing
os.makedirs(grassdata, exist_ok=True)
Part 2: Processing data in GRASS
In this notebook we’ll go through the processing of MODIS LST daily time series data to derive relevant predictor variables for modeling the distribution of Aedes albopictus in Northern Italy. Furthermore, we’ll show how to obtain and process occurrence data and background points.
Let’s first go through some temporal concepts within GRASS GIS…
The TGRASS framework
GRASS GIS was the first FOSS GIS that incorporated capabilities to manage, analyze, process and visualize spatio-temporal data, as well as the temporal relationships among time series.
- TGRASS is fully based on metadata and does not duplicate any dataset
- Snapshot approach, i.e., adds time stamps to maps
- A collection of time-stamped maps (snapshots) of the same variable is called a space-time dataset, or STDS
- Maps in a STDS can have different spatial and temporal extents
- Space-time datasets can be composed of raster, raster 3D or vector maps, and so we call them:
- Space time raster datasets (STRDS)
- Space time 3D raster datasets (STR3DS)
- Space time vector datasets (STVDS)
Temporal modules
GRASS temporal modules are named and organized following the GRASS core naming scheme. In this way, we have:
- t.*: General modules to handle STDS of all types
- t.rast.*: Modules that deal with STRDS
- t.rast3d.*: Modules that deal with STR3DS
- t.vect.*: Modules that deal with STVDS
Other TGRASS notions
- Time can be defined as intervals (start and end time) or instances (only start time)
- Time can be absolute (e.g., 2017-04-06 22:39:49) or relative (e.g., 4 years, 90 days)
- Granularity is the greatest common divisor of the temporal extents (and possible gaps) of all maps in the space-time cube
- Topology refers to temporal relations between time intervals in a STDS.
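To make the notion of granularity concrete, here is a minimal sketch that computes it for a handful of interval maps. The dates and the helper `granularity_days` are invented for illustration; this is not how TGRASS implements it internally.

```python
from datetime import date
from functools import reduce
from math import gcd

# Toy interval maps (start, end); the dates are illustrative, not from the dataset
maps = [
    (date(2014, 1, 1), date(2014, 1, 9)),    # 8-day map
    (date(2014, 1, 9), date(2014, 1, 17)),   # 8-day map
    (date(2014, 1, 25), date(2014, 2, 10)),  # 16-day map after an 8-day gap
]

def granularity_days(intervals):
    """Greatest common divisor, in days, of all map extents and the gaps between them."""
    intervals = sorted(intervals)
    spans = [(end - start).days for start, end in intervals]
    gaps = [(nxt[0] - cur[1]).days for cur, nxt in zip(intervals, intervals[1:])]
    # Zero-length gaps (adjacent maps) carry no information, so they are skipped
    return reduce(gcd, [d for d in spans + gaps if d > 0])

print(granularity_days(maps))
```

With extents of 8, 8 and 16 days and one gap of 8 days, the granularity of this toy series is 8 days.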
TGRASS framework and workflow
GRASS + Python
In this part of the studio we’ll work with GRASS and Python, so let’s first see/recall the very basics.
Python package grass.script
The grass.script or GRASS GIS Python Scripting Library provides functions for calling GRASS modules within Python scripts. The most commonly used functions include:
- run_command: used when the output of the modules is a raster or vector, no text type output is expected
- read_command: used when the output of the modules is of text type
- parse_command: used with modules whose output can be converted to key=value pairs
- write_command: used with modules that expect text input, either in the form of a file or from stdin
It also provides several wrapper functions for frequently used modules, for example:
- To get info from a raster, script.raster.raster_info() is used:
gs.raster_info('dsm')
- To get info of a vector, script.vector.vector_info() is used:
gs.vector_info('roads')
- To list the raster in a location, script.core.list_grouped() is used:
gs.list_grouped(type=['raster'])
- To obtain the computational region, script.core.region() is used:
gs.region()
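The difference between read_command and parse_command is easiest to see on the text itself: the former hands back the raw string, while the latter returns a dictionary built from key=value lines. The sketch below mimics that conversion in plain Python. `parse_key_value` is a hypothetical helper written for this illustration, not the library's implementation, and the sample numbers are made up.

```python
def parse_key_value(text, sep="="):
    """Turn 'key=value' lines into a dict, roughly the shape parse_command returns."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(sep)
        result[key.strip()] = value.strip()
    return result

# Illustrative text in key=value form; the numbers are invented
sample = "n=5000000\ns=4000000\nnsres=1000\newres=1000\n"
region = parse_key_value(sample)
print(region["nsres"])
```

Note that the values stay strings in this sketch, so they need converting before any arithmetic.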
Python package grass.jupyter
The grass.jupyter library improves the integration of GRASS and Jupyter, and provides different classes to facilitate GRASS maps visualization:
- init: starts a GRASS session and sets up all necessary environment variables
- Map: 2D rendering
- Map3D: 3D rendering
- InteractiveMap: interactive visualization with folium
- TimeSeriesMap: visualization for spatio-temporal data
Hands-on
So let’s start… We begin by setting variables, checking the GRASS installation and initializing GRASS GIS.
Now we are ready to start a GRASS GIS session
Explore data in the mapset
Let’s first explore what we have within the italy_LST_daily mapset and display vector and raster maps using different classes from the grass.jupyter library.
SDM workflow
In this part of the Studio we’ll be addressing the left part of the SDM workflow, occurrence and background data and predictors:
Importing species records
We will use occurrence data already downloaded and cleaned. We need to import it into GRASS GIS first.
Let’s add the occurrence points over the previous interactive map
You can also get the mosquito occurrences (or any other species or taxa) directly from GBIF into GRASS by means of v.in.pygbif as follows:
Creating random background points
The MaxEnt algorithm that we will use in the next part of this session requires not only the locations of known occurrences, but also information on the rest of the available environment. These are not absences but background data: we do not actually know whether the species is there or not, but we need these points to compare against the features of the places where the species does occur.
To avoid getting background points exactly where occurrences are, we’ll create buffers around them. Then, we need to ensure that background points fall only over land within our computational region. To do that, we’ll create a mask over land and overlay the buffers with the mask. Can you guess what the output will be?
Let’s display the result
Finally, let’s create the random background points…
and display occurrence and background points together over an LST map.
Create daily LST STRDS
Now we’ll start processing the raster data to derive potentially relevant predictors to include in the model. Our data consists of a time series of daily LST averages. We’ll use the GRASS temporal framework for this and the first step is to create the time series object and register maps in it. See t.create and t.register for further details.
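One convenient way to register many maps at once is to give t.register a text file with one map per line in the form name|start|end. The sketch below only builds such lines in plain Python. The map naming pattern, the start date and the helper `registration_lines` are assumptions made for illustration; the actual map names in the mapset may differ.

```python
from datetime import date, timedelta

def registration_lines(basename, first_day, n_days):
    """Build 'name|start|end' lines, one daily map per line."""
    lines = []
    for i in range(n_days):
        start = first_day + timedelta(days=i)
        end = start + timedelta(days=1)
        # Hypothetical naming pattern: <basename>_<year>_<day of year>
        name = f"{basename}_{start:%Y}_{start:%j}"
        lines.append(f"{name}|{start.isoformat()}|{end.isoformat()}")
    return lines

lines = registration_lines("lst", date(2014, 1, 1), 3)
print("\n".join(lines))
```

Written to disk, such a file could then be passed to t.register through its file option, after the empty STRDS has been created with t.create.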
Generate environmental variables from LST STRDS
Now that we created the time series or “STRDS”, let’s start estimating relevant variables. We start by calculating long term aggregations, also called climatologies.
Long term monthly avg, min and max LST
Let’s see an example first; we’ll estimate the average of all maps whose start date is within January.
To estimate climatologies for all months, we first need the list of maps that will be the input for t.rast.series. We’ll therefore test the selection condition in t.rast.list before running the aggregation.
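The where condition is an SQL expression evaluated against the temporal database. Assuming the default SQLite backend, its behaviour can be tried out on a toy table with Python's built-in sqlite3 module. The table, map names and dates below are invented and do not reproduce the actual TGRASS schema.

```python
import sqlite3

# Toy table standing in for a registered map list; names and dates are made up
rows = [
    ("lst_a", "2014-01-15 00:00:00"),
    ("lst_b", "2014-02-01 00:00:00"),
    ("lst_c", "2015-01-03 00:00:00"),
    ("lst_d", "2015-07-20 00:00:00"),
]
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE maps (name TEXT, start_time TEXT)")
con.executemany("INSERT INTO maps VALUES (?, ?)", rows)

def maps_in_month(month):
    """Return map names whose start_time falls in the given two-digit month."""
    where = f"strftime('%m', start_time)='{month}'"
    query = f"SELECT name FROM maps WHERE {where} ORDER BY start_time"
    return [name for (name,) in con.execute(query)]

print(maps_in_month("01"))
```

The January selection picks maps from every year in the series, which is exactly what a long-term monthly climatology needs.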
Now we add the methods and we are ready to estimate climatologies for all months with three different methods.
# Now we estimate the climatologies for all months and methods
months=['{0:02d}'.format(m) for m in range(1,13)]
methods=["average","minimum","maximum"]
for m in months:
    for me in methods:
        gs.run_command("t.rast.series",
                       input="lst_daily",
                       method=me,
                       where=f"strftime('%m', start_time)='{m}'",
                       output="lst_{}_{}".format(me,m))
Bioclimatic variables
Perhaps you have heard of the WorldClim or CHELSA bioclimatic variables? These are 19 variables that represent potentially limiting conditions for species. They are derived from combinations of long-term temperature and precipitation averages. As we do not have precipitation data in this exercise, we’ll only estimate the bioclimatic variables that include temperature. See the r.bioclim manual for further details. Note that we’ll use the climatologies estimated in the previous step.
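To give a feel for what some of the temperature-only variables mean, here is a per-pixel sketch computed from twelve monthly climatologies. The values are invented, and the units and scaling of the module's actual output may differ from this plain arithmetic.

```python
# Toy monthly climatologies for one pixel, in °C (invented values, January to December)
tavg = [2, 4, 8, 12, 17, 21, 24, 23, 19, 13, 7, 3]
tmin = [-3, -1, 2, 6, 11, 15, 18, 17, 13, 8, 3, -2]
tmax = [7, 9, 14, 18, 23, 27, 30, 29, 25, 18, 11, 8]

bio1 = sum(tavg) / len(tavg)  # BIO1: annual mean temperature
bio5 = max(tmax)              # BIO5: max temperature of the warmest month
bio6 = min(tmin)              # BIO6: min temperature of the coldest month
bio7 = bio5 - bio6            # BIO7: temperature annual range

print(bio1, bio5, bio6, bio7)
```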
Let’s have a look at some of the maps we just created
Spring warming
We define spring warming as the velocity with which temperature increases from winter into spring and we calculate it as slope(daily Tmean February-March-April). We will use t.rast.aggregate.
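For intuition, the slope of a series is the coefficient of a least-squares line fitted through the values. The sketch below computes it for one pixel, assuming evenly spaced daily steps indexed by position; whether the GRASS slope method uses exactly this convention should be checked in the r.series manual. The temperatures are invented.

```python
def slope(values):
    """Least-squares slope of values against their position 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den

# Invented daily mean LST for one pixel, warming by 0.5 °C per day
daily_lst = [5.0 + 0.5 * day for day in range(10)]
print(slope(daily_lst))
```

A positive slope indicates warming over the period and a negative one cooling, so the same calculation serves both spring warming and autumnal cooling.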
# Annual spring warming
# Months used for the slope: February, March and April
months=['{0:02d}'.format(m) for m in range(2,5)]
gs.run_command("t.rast.aggregate",
               input="lst_daily",
               output="annual_spring_warming",
               basename="spring_warming",
               suffix="gran",
               method="slope",
               granularity="1 years",
               where=f"strftime('%m',start_time)='{months[0]}' or strftime('%m',start_time)='{months[1]}' or strftime('%m', start_time)='{months[2]}'")
Autumnal cooling
We define autumnal cooling as the velocity with which temperature decreases from summer into fall and we calculate it as slope(daily Tmean August-September-October).
# Annual autumnal cooling
# Months used for the slope: August, September and October
months=['{0:02d}'.format(m) for m in range(8,11)]
gs.run_command("t.rast.aggregate",
               input="lst_daily",
               output="annual_autumnal_cooling",
               basename="autumnal_cooling",
               suffix="gran",
               method="slope",
               granularity="1 years",
               where=f"strftime('%m',start_time)='{months[0]}' or strftime('%m',start_time)='{months[1]}' or strftime('%m', start_time)='{months[2]}'")
Number of days with LSTmean >= 20 and <= 30
Mosquitoes (and the viruses they might carry) tend to thrive within a certain range of temperatures. Let’s assume this range is from 20 to 30 °C. Here, we’ll estimate the number of days within this range per year, and then we’ll average those counts across years. See the t.rast.algebra manual for further details.
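The per-pixel logic of this two-step calculation can be sketched in plain Python: count the days in range for each year, then average the yearly counts. The temperatures and the helper `days_in_range` are invented for illustration.

```python
def days_in_range(daily_values, low=20.0, high=30.0):
    """Count the days whose value lies within [low, high], bounds included."""
    return sum(1 for v in daily_values if low <= v <= high)

# Invented daily LST values (°C) for one pixel, a few days per year
lst_by_year = {
    2014: [18.0, 20.0, 25.5, 30.0, 31.2],
    2015: [21.0, 22.0, 19.9, 29.0, 28.0],
}

counts = {year: days_in_range(vals) for year, vals in lst_by_year.items()}
mean_count = sum(counts.values()) / len(counts)
print(counts, mean_count)
```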
Number of consecutive days with LSTmean <= -10.0
Likewise, there are temperature thresholds that mark a limit to mosquito survival. Here, we’ll use the lower temperature limit to survival. Most importantly, we’ll count the number of consecutive days with temperatures below this threshold.
Here, we’ll again use the temporal algebra and recall the concept of topology that we defined at the beginning of the notebook. First, we need to create a STRDS of annual granularity that contains only zeroes. This annual STRDS, which we call the annual mask, will be the base to which we add 1 each time the condition of -10 °C or lower on consecutive days is met. Finally, we estimate the median number of days with LST at or below -10 °C over the 5 years.
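The neighbour test at the heart of this step flags a day when it is cold and either the previous or the next day is cold too. A minimal per-pixel sketch follows; it assumes that a missing neighbour at either end of the series simply does not satisfy the condition, and all temperatures are invented.

```python
from statistics import median

def consecutive_cold_days(values, threshold=-10.0):
    """Count days at or below threshold that have a cold neighbouring day."""
    cold = [v <= threshold for v in values]
    count = 0
    for i, is_cold in enumerate(cold):
        prev_cold = i > 0 and cold[i - 1]
        next_cold = i + 1 < len(cold) and cold[i + 1]
        if is_cold and (prev_cold or next_cold):
            count += 1
    return count

# Invented daily LST values (°C) for one pixel
winters = {
    2014: [-12.0, -11.0, -5.0, -10.5, -3.0],  # a run of 2 plus one isolated cold day
    2015: [-10.0, -10.0, -13.0, 0.0, -11.0],  # a run of 3 plus one isolated cold day
    2016: [-2.0, -1.0, 0.0, 1.0, 2.0],        # no cold days
}

per_year = {year: consecutive_cold_days(vals) for year, vals in winters.items()}
median_days = median(list(per_year.values()))
print(per_year, median_days)
```

Isolated cold days are not counted, which is the point of testing the neighbours rather than the day alone.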
# Calculate consecutive days with LST <= -10.0
expression="lower_m2_consec_days = annual_mask_0 {+,contains,l} if(lst_daily <= -10.0 && lst_daily[-1] <= -10.0 || lst_daily[1] <= -10.0 && lst_daily <= -10.0, 1, 0)"
gs.run_command("t.rast.algebra",
               expression=expression,
               basename="lower_m2_",
               suffix="gran",
               nproc=7)
We have now derived many potentially relevant predictors for the mosquito habitat suitability and we could still derive some more, for example, the number of mosquito or virus cycles per year based on development temperature thresholds and growing degree days (GDD). This could be achieved with t.rast.accumulate and t.rast.accdetect.
We will now close this session as we will open it again from R in the last part of this session.